Optimizing Magnetic Resonance Imaging for Image-Guided Radiotherapy
Magnetic resonance imaging (MRI) is playing an increasingly important role in image-guided radiotherapy. MRI provides excellent soft tissue contrast, and is flexible in characterizing various tissue properties including relaxation, diffusion and perfusion. This thesis aims at developing new image analysis and reconstruction algorithms to optimize MRI in support of treatment planning, target delineation and treatment response assessment for radiotherapy.
First, unlike Computed Tomography (CT) images, MRI cannot provide the electron density information necessary for radiation dose calculation. To address this, we developed a synthetic CT generation algorithm that generates pseudo-CT images from MRI, based on tissue classification of MR images of female pelvic patients. To improve tissue classification accuracy, we learned a pelvic bone shape model from a training dataset and integrated the shape model into an intensity-based fuzzy c-means classification scheme. The shape-regularized tissue classification algorithm is capable of differentiating tissues whose MRI intensity distributions overlap significantly. Treatment planning dose calculations using synthetic CT image volumes generated from the tissue classification results show acceptably small deviations compared to calculations on CT volumes. Because MRI artifacts such as B1 field inhomogeneity (bias field) may degrade tissue classification accuracy, we also developed an algorithm that integrates bias field correction into the tissue classification scheme. We modified the fuzzy c-means classification by modeling the observed image intensity as the true intensity corrupted by a multiplicative bias field, with a regularization term that further ensures the smoothness of the bias field. We solved the optimization problem using a linearized alternating direction method of multipliers (ADMM), which is more computationally efficient than existing methods.
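The core membership/centroid updates behind this scheme can be illustrated with a minimal sketch of plain fuzzy c-means on a 1-D intensity vector. This is only the baseline clustering step; the thesis extends it with a shape prior and a multiplicative bias-field term solved via linearized ADMM, neither of which is shown here, and all parameter values below are illustrative.

```python
import numpy as np

def fuzzy_c_means(x, n_clusters=3, m=2.0, n_iter=50, seed=0):
    """Plain fuzzy c-means on a 1-D intensity vector (baseline only)."""
    rng = np.random.default_rng(seed)
    u = rng.random((n_clusters, x.size))
    u /= u.sum(axis=0)                      # memberships sum to 1 per voxel
    for _ in range(n_iter):
        um = u ** m
        centers = um @ x / um.sum(axis=1)   # fuzzy-weighted centroids
        d = np.abs(x[None, :] - centers[:, None]) + 1e-12
        u = d ** (-2.0 / (m - 1.0))
        u /= u.sum(axis=0)                  # standard membership update
    return centers, u

# two well-separated synthetic "tissue" intensities
x = np.concatenate([np.full(50, 1.0), np.full(50, 10.0)])
centers, u = fuzzy_c_means(x, n_clusters=2)
print(np.sort(centers))
```

Hard tissue labels follow from the argmax of the membership matrix `u`; the shape-regularized version adds a penalty coupling these memberships to the learned bone shape model.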
The second part of this thesis looks at a special MR imaging technique, diffusion-weighted MRI (DWI). By acquiring a series of DWI images over a wide range of b-values, high-order diffusion analysis can be performed on the image series, and new biomarkers for tumor grading, delineation and treatment response evaluation may be extracted. However, DWI suffers from low signal-to-noise ratio at high b-values, and the multi-b-value acquisition makes the total scan time impractical for clinical use. In this thesis, we proposed an accelerated DWI scheme that sparsely samples k-space and reconstructs images using a model-based algorithm. Specifically, we built a 3D block-Hankel tensor from k-space samples and modeled both local and global correlations of the high-dimensional k-space data as a low-rank property of the tensor. We also added a phase constraint to account for large phase variations across different b-values and to allow reconstruction from partial Fourier acquisition, which further accelerates the image acquisition. We proposed an ADMM algorithm to solve the constrained image reconstruction problem. Image reconstructions using both simulated and patient data show improved signal-to-noise ratio. Compared to the clinically used parallel imaging scheme, which achieves a 4-fold acceleration, our method achieves an 8-fold acceleration. Reconstructed images show reduced reconstruction errors on simulated data and similar diffusion parameter mapping results on patient data.
PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/143919/1/llliu_1.pd
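The low-rank property that this reconstruction exploits can be sketched in one dimension: lifting a smooth k-space signal into a Hankel matrix of sliding windows yields a matrix of low rank. This toy example (a single complex exponential, whose Hankel lift is exactly rank 1) stands in for the 3-D block-Hankel tensor of the thesis; the window length and signal are illustrative.

```python
import numpy as np

def hankel_matrix(sig, window):
    """Lift a 1-D signal into a Hankel matrix whose columns are
    sliding windows of the signal."""
    n = sig.size
    return np.array([sig[i:i + window] for i in range(n - window + 1)]).T

# a single complex exponential in "k-space"
n = 64
sig = np.exp(2j * np.pi * 0.1 * np.arange(n))
H = hankel_matrix(sig, window=16)
rank = np.linalg.matrix_rank(H, tol=1e-8)
print(rank)  # every window is a scalar multiple of the first, so rank 1
```

In the full method, such lifts are stacked across b-values into a tensor, and the reconstruction enforces low tensor rank (plus the phase constraint) while agreeing with the sparsely sampled k-space data, with ADMM alternating between these constraints.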
Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning
Recent progress has been made in using attention-based encoder-decoder frameworks for video captioning. However, most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., "gun" and "shooting") and non-visual words (e.g., "the", "a"), even though non-visual words can be easily predicted using a natural language model without considering visual signals or attention. Imposing the attention mechanism on non-visual words can mislead the decoder and decrease the overall performance of video captioning. To address this issue, we propose a hierarchical LSTM with adjusted temporal attention (hLSTMat) approach for video captioning. Specifically, the proposed framework utilizes temporal attention for selecting specific frames to predict the related words, while the adjusted temporal attention decides whether to depend on the visual information or the language context information. In addition, hierarchical LSTMs are designed to simultaneously consider both low-level visual information and high-level language context information to support the video caption generation. To demonstrate the effectiveness of our proposed framework, we test our method on two prevalent datasets, MSVD and MSR-VTT; experimental results show that our approach outperforms the state-of-the-art methods on both datasets.
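The adjusted-attention idea can be sketched as a gated blend of an attended visual context and the decoder's language state. This is an illustrative NumPy sketch, not the paper's parameterization: the gate weights, dimensions, and the simple dot-product scoring are all assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def adjusted_attention(frames, h, w_gate):
    """Temporal attention picks frames; a sigmoid gate beta decides
    how much the decoder relies on visual vs. language context."""
    scores = frames @ h                          # frame relevance to hidden state
    alpha = softmax(scores)                      # temporal attention weights
    visual = alpha @ frames                      # attended visual context
    beta = 1.0 / (1.0 + np.exp(-(w_gate @ h)))   # visual-vs-language gate
    return beta * visual + (1.0 - beta) * h      # blended context vector

rng = np.random.default_rng(0)
frames = rng.standard_normal((10, 8))   # 10 frame features of dimension 8
h = rng.standard_normal(8)              # decoder hidden state
ctx = adjusted_attention(frames, h, rng.standard_normal(8))
print(ctx.shape)
```

A gate near 0 lets the decoder emit non-visual words ("the", "a") purely from language context, which is the behavior the abstract motivates.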
Frequency Domain Model Augmentation for Adversarial Attack
For black-box attacks, the gap between the substitute model and the victim model is usually large, which manifests as weak attack performance. Motivated by the observation that the transferability of adversarial examples can be improved by attacking diverse models simultaneously, model augmentation methods that simulate different models using transformed images have been proposed. However, existing spatial-domain transformations do not translate to significantly diverse augmented models. To tackle this issue, we propose a novel spectrum simulation attack to craft more transferable adversarial examples against both normally trained and defense models. Specifically, we apply a spectrum transformation to the input and thus perform the model augmentation in the frequency domain. We theoretically prove that the transformation derived from the frequency domain leads to a diverse spectrum saliency map, an indicator we propose to reflect the diversity of substitute models. Notably, our method can be generally combined with existing attacks. Extensive experiments on the ImageNet dataset demonstrate the effectiveness of our method, \textit{e.g.}, attacking nine state-of-the-art defense models with an average success rate of \textbf{95.4\%}. Our code is available at \url{https://github.com/yuyang-long/SSA}.
Comment: Accepted by ECCV 202
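The frequency-domain augmentation can be illustrated with a minimal sketch: transform the input to the frequency domain, randomly rescale each frequency component, and transform back, so that each random draw behaves like a slightly different model when gradients are averaged. This sketch uses an FFT and a Gaussian spectral mask for simplicity; the paper's actual transformation (see the linked repository) is DCT-based, and `sigma` here is an illustrative parameter.

```python
import numpy as np

def spectrum_transform(x, sigma=0.5, seed=0):
    """Randomly rescale the spectrum of an image and invert,
    simulating a perturbed model for gradient averaging."""
    rng = np.random.default_rng(seed)
    spec = np.fft.fft2(x)
    mask = 1.0 + sigma * rng.standard_normal(x.shape)  # random spectral rescaling
    return np.real(np.fft.ifft2(spec * mask))

x = np.ones((8, 8))   # toy "image"
variants = [spectrum_transform(x, seed=s) for s in range(5)]
# in the attack, loss gradients would be averaged over such spectral variants
```

Averaging the attack gradient over many such draws is what substitutes for attacking an ensemble of distinct models.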
Software for Automated Comparison of Low Molecular Weight Heparins Using Top-Down LC/MS Data
Low molecular weight heparins are complex polycomponent drugs that have recently become amenable to top-down analysis using liquid chromatography-mass spectrometry. Even with open-source deconvolution software (DeconTools) and automatic structural assignment software (GlycReSoft), the comparison of two or more low molecular weight heparins is extremely time-consuming, taking about a week for an expert analyst, and provides no guarantee of accuracy. Efficient data processing tools are required to improve analysis. This study uses Microsoft Excel™ Visual Basic for Applications to extend Excel's standard functionality with macro functions and specific mathematical modules for mass spectrometric data processing. The resulting program, GlycCompSoft, enables the comparison of top-down analytical glycomics data on two or more low molecular weight heparins, with a low error rate and good time efficiency in the automatic processing of large data sets. Experimental results based on three lots of Lovenox®, Clexane®, and three generic enoxaparin samples show that the run time of GlycCompSoft decreases from 11 to 2 seconds as the data processed decreases from 18,000 to 1,500 rows.
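The core comparison such a tool automates is matching deconvoluted component masses between two heparin lots within a mass tolerance. The sketch below is illustrative only, not GlycCompSoft itself (which is written in Excel VBA): the masses, the 20 ppm tolerance, and the greedy nearest-mass pairing are all assumptions for demonstration.

```python
def match_components(masses_a, masses_b, ppm=20.0):
    """Pair each deconvoluted mass in lot A with the closest mass in
    lot B that falls inside the ppm tolerance window."""
    matches = []
    for ma in masses_a:
        best = min(masses_b, key=lambda mb: abs(mb - ma))  # nearest candidate
        if abs(best - ma) / ma * 1e6 <= ppm:               # ppm tolerance check
            matches.append((ma, best))
    return matches

lot_a = [1153.37, 1711.53, 2269.70]   # hypothetical component masses (Da)
lot_b = [1153.38, 1711.50, 2500.00]
print(match_components(lot_a, lot_b))
```

Repeating this pairing over thousands of assigned components, and tabulating matched versus unmatched abundances, is what turns a week of manual spreadsheet work into seconds of automated processing.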
Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation
The task of dynamic scene graph generation (DynSGG) aims to generate scene graphs for given videos, which involves modeling the spatial-temporal information in the video. However, due to the long-tailed distribution of samples in the dataset, previous DynSGG models fail to predict the tail predicates. We argue that this phenomenon arises because previous methods pay attention only to the local spatial-temporal information and neglect the consistency of multiple frames. To solve this problem, we propose a novel DynSGG model based on multi-task learning, DynSGG-MTL, which introduces local interaction information and global human-action interaction information. The interaction between objects and frame features lets the model more fully understand the visual context of the single image, while long-temporal human actions supervise the model to generate multiple scene graphs that conform to the global constraints, preventing the model from failing to learn the tail predicates. Extensive experiments on the Action Genome dataset demonstrate the efficacy of our proposed framework, which not only improves dynamic scene graph generation but also alleviates the long-tail problem.
Comment: The author has withdrawn this paper due to a critical definitional error in multi-task learning for dynamic SGG debiasing. This error aligned with the definition of dynamic SGG tasks, resulting in an unfair comparison with state-of-the-art (SOTA) methods, which in turn hindered the ability to evaluate the paper's contribution.
Copper chloride-catalyzed efficient three-component one-pot synthesis of carbamatoalkyl naphthols under solvent-free conditions
A highly efficient synthesis of carbamatoalkyl naphthols by a one-pot three-component condensation of 2-naphthol, aldehydes, and methyl/ethyl/benzyl carbamates in the presence of copper chloride under thermal solvent-free conditions has been performed. The choice of solvent, the amounts of raw materials and catalyst, and the reaction temperature were investigated. Experimental results show that only 1 mol% of catalyst was enough to induce the conversion. All new products were characterized by melting point, IR, ¹H NMR, ¹³C NMR, and mass spectra. A mechanism to rationalize the reaction was proposed.
Quantization-based hashing: a general framework for scalable image and video retrieval
Nowadays, due to the exponential growth of user-generated images and videos, there is increasing interest in learning-based hashing methods. In computer vision, hash functions are learned in such a way that the hash codes preserve essential properties of the original space (or label information); the Hamming distance between hash codes then approximates the data similarity. Vector quantization methods, on the other hand, quantize the data into different clusters based on the criterion of minimal quantization error, and then perform the search using look-up tables. While hashing methods using Hamming distance achieve faster search speed, quantization methods with the same code length often achieve higher accuracy, owing to their lower quantization error and more flexible distance lookups. To improve the effectiveness of hashing methods, in this work we propose Quantization-based Hashing (QBH), a general framework which incorporates the advantages of quantization-error-reduction methods into conventional property-preserving hashing methods. The learned hash codes simultaneously preserve the properties of the original space and reduce the quantization error, and thus achieve better performance. Furthermore, the hash functions and a quantizer can be jointly learned and iteratively updated in a unified framework, which can readily be used to generate hash codes or quantize new data points. Importantly, QBH is a generic framework that can be integrated into different property-preserving hashing methods and quantization strategies, and we apply QBH to both unsupervised and supervised hashing models as showcases in this paper. Experimental results on three large-scale unlabeled datasets (SIFT1M, GIST1M, and SIFT1B), three labeled datasets (ESPGAME, IAPRTC and MIRFLICKR), and one video dataset (UQ_VIDEO) demonstrate the superior performance of QBH over existing unsupervised and supervised hashing methods.
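The property-preserving hashing that QBH builds on can be illustrated with the simplest such scheme: sign-of-random-projection hashing, where nearby points receive codes with small Hamming distance. This is a minimal stand-in, not QBH itself; QBH would additionally learn a quantizer jointly with these hash functions, and the dimensions and seed below are illustrative.

```python
import numpy as np

def hash_codes(x, projections):
    """Sign-of-random-projection hashing: one bit per projection."""
    return (x @ projections > 0).astype(np.uint8)

def hamming(a, b):
    """Hamming distance between two binary codes."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
proj = rng.standard_normal((16, 32))        # 16-dim data -> 32-bit codes
q = rng.standard_normal(16)                 # query point
near = q + 0.01 * rng.standard_normal(16)   # a close neighbor
far = rng.standard_normal(16)               # an unrelated point
cq, cn, cf = (hash_codes(v, proj) for v in (q, near, far))
print(hamming(cq, cn), hamming(cq, cf))     # neighbor gets the smaller distance
```

Because Hamming distances between such codes only coarsely approximate the original similarities, QBH's contribution is to reduce the residual quantization error while keeping this fast binary search structure.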